Git
Git is a version control system developed by Linus Torvalds for managing large collaborative software projects. It is not limited to software, however. git manages changes in files and stores these changes as commits. Each commit contains a description (that you write) of what you've changed, why, what it means, etc.

This is a great tool. If you've ever written a report and thought "why did I remove that paragraph?" or "when did I add that?", with git you can find out! In addition, it lets you "roll back" specific commits, so it becomes something like a persistent undo.

Sounds great, right? It has a slight drawback: git only really understands changes to text files. It will store office documents and spreadsheets, but it can't work out meaningful changes between versions of them, and a partial rollback will probably only corrupt your file. git works well with plain text, so for source code, CSV data files, and other data stored as text, git is fantastic.

Further, git is a great tool for collaborative working: it allows multiple people to work offline on the same set of files and then merge their changes together. If everything was committed properly, the changes will merge together with little hassle.

Finally, git can be used as a sort of backup protocol. It allows you to push changes to a remote server (or USB drive, or hard drive, or...).

Relevance

Research requires collaboration, and git is a good way to manage this. The log of commit messages is a great way to annotate stores of data, and makes it difficult to lose files.

How git works

A directory (folder) containing files managed by git is called a git repository. A repository contains a ".git" sub-directory, where git stores all the metadata about the repository. When you add files to git, they are stored as git objects in this ".git" directory, along with information about what each file contains, when it was modified, and so on. When you change a file, the changes need to be stored in git.
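To make the ".git" directory a little less mysterious, here is a minimal sketch that creates a throwaway repository, adds one file, and asks git what it stored. The scratch directory, the file name notes.txt and its contents are made up for illustration; nothing here touches a real project.

```shell
#!/usr/bin/env bash
# Sketch: look inside the .git directory of a throwaway repository.
set -euo pipefail

repo="$(mktemp -d)"        # hypothetical scratch directory
cd "$repo"
git init --quiet           # creates the .git sub-directory

# A tiny example file (name and contents are assumptions for this sketch).
printf 'first draft\n' > notes.txt

# Adding the file stores its contents as an object inside .git/objects.
git add notes.txt

# git names each stored object by a hash of its contents.
object_id="$(git hash-object notes.txt)"

# Ask git what kind of object that is and what it contains.
object_type="$(git cat-file -t "$object_id")"
object_body="$(git cat-file -p "$object_id")"

echo "object id:   $object_id"
echo "object type: $object_type"
echo "contents:    $object_body"

# Everything git knows about the repository lives in here.
ls .git
```

You never need to poke around in ".git" during normal use; the point is only that your files and their history live there, which is why deleting that directory deletes the history.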
You track the changes by "committing" them. You do this by adding files to a commit, then writing a commit message. The changes are stored along with the message and your name so that in the future, the changes, who made them, and why are visible.

Installing

Windows

Go to the git website and download the relevant installer. Once you've installed it, search for "Git Bash" in the Start menu. This should pop up a radical hacker window called a terminal. This is where you enter commands for the computer to process. You can access git from this terminal.

Linux and macOS

If you're on Linux, you should instead use your package manager (apt, yum, pkg, etc.), and if you're on macOS, you should consider using Homebrew. Once it's installed, you can access git using a terminal.

Setting up git

Before you can use git, you need to tell it a couple of things. It needs to know your name and email address, so that when you track a change to a file, someone down the line can see who made that change (and email them to ask about it). At the terminal enter:

$ git config --global user.name "Your Name"

replacing Your Name with your full name. Then:

$ git config --global user.email "you@example.com"

replacing you@example.com with your email address.

Creating a repository

To create a new repository, navigate to the directory you'd like to track using git.

$ cd /path/to/repo

Then you need to initialise the repository. This creates the ".git" folder talked about above.

$ git init
Initialized empty Git repository in /path/to/repo/.git/

The ".git" folder is now created, and git is ready to start tracking files! Let's check the status of our new repository.

$ git status
On branch master

No commits yet

Untracked files:
  (use "git add <file>..." to include in what will be committed)
        IMPORTANT REPORT.tex

Git has noticed the file but is not tracking it yet. Add it to a commit and check the status again:

$ git add "IMPORTANT REPORT.tex"
$ git status
On branch master

No commits yet

Changes to be committed:
  (use "git rm --cached <file>..." to unstage)
        new file:   IMPORTANT REPORT.tex

Now commit the change:

$ git commit

This opens a text editor, usually vim or nano. The former takes some getting used to; the latter is very easy (commands are written on the bottom of the screen). Into the editor window you should enter a message describing what you changed. Here, we've just added a new file to the repository.
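The add-then-commit cycle described above can be sketched end to end as a short script. It works in a scratch directory, uses a placeholder identity set for that repository only, and passes the commit message with -m on the command line instead of opening the editor; the file contents and the message are made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch: create a repository, add one file, and record the first commit.
set -euo pipefail

repo="$(mktemp -d)"        # hypothetical scratch directory
cd "$repo"
git init --quiet

# Placeholder identity, stored in this repository only (no --global).
git config user.name "Example Author"
git config user.email "author@example.com"

# Stand-in for the report from the walkthrough; the contents are made up.
printf '\\documentclass{article}\n' > "IMPORTANT REPORT.tex"

git status --short                   # the file shows up as untracked
git add "IMPORTANT REPORT.tex"       # quotes are needed because of the space
git status --short                   # the file is now staged for the commit

# -m gives the message directly, so no editor window opens.
git commit --quiet -m "Add first draft of the report"

# The history now contains exactly one commit.
git log --oneline
commit_count="$(git rev-list --count HEAD)"
last_message="$(git log -1 --format=%s)"
echo "commits: $commit_count, last message: $last_message"
```

For everyday work, running git commit without -m and writing a longer message in the editor is often the better habit, since there is room to explain why a change was made.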
When you "add" to a commit, you're not really adding a file; you're adding the difference between the file and what's already in the git index. If the file does not exist in the index, you are adding a new file. Mostly, you'll just be adding changes.

Adding a remote

Once you've got your repository and some changes, you'll probably want some common location to access it from. Commonly, repositories are stored on the web (see GitLab or GitHub). These websites make it easy to create and manage repositories on the go. If, however, you want to store the repository on your own computer (or a USB stick, mounted network location, ...), you need to create a special type of repository in the remote location.

Say your remote location is just another folder on your computer. Navigate to that folder:

$ cd /path/to/remote

Then you need to create the special type of repository, called a bare repository. A bare repository contains only what would be inside the .git directory of a normal repository, with no working copy of the files.

$ git init --bare

Then, go back to your repository.

$ cd /path/to/repository

You'll need to add the remote to this repository. If you're using GitLab or GitHub, then the path to the remote is the URL for the repository you created there (e.g. https://github.com/cbosoft/syringepump).

$ git remote add origin /path/to/remote

Then push the information from the repo to the remote, while telling git you want this remote to be the main upstream repo for you (i.e. this is the remote you want to compare to when checking for updates). If git status reports a different branch name, such as main, use that instead of master.

$ git push --set-upstream origin master

You only need --set-upstream the first time you push to a new remote. Normally, when pushing changes, you can just do:

$ git push

Cloning from a remote

Now you've made changes and pushed to that remote. Now, you're in a new location and you want to work on the same project, but you don't have your computer with you.
You can git clone a repository from the remote to get your project directory on the new machine:

$ git clone /path/to/remote

If you're using one of the web services, replace the path with the URL. For a private repository you'll be asked to authenticate; note that some services (GitHub, for example) expect a personal access token rather than your account password. This command creates a copy of the repository on the new machine. Now you can work on it (making changes, committing them, and pushing to the remote) as if you were on your home machine. You can work from anywhere with git!

Pulling and merging from remote

Aha! Now you're back home, and you need to get the changes you made on the other machine from the remote. You could delete your folder and re-clone the repo... but you've also got changes on this machine you didn't push before. If you deleted the repo and re-cloned, you'd lose them. To get changes from the remote, while being wary of changes made locally, use:

$ git pull

This downloads the changes from the remote to the local repository, then applies them. Recent versions of git may stop and ask how to combine the two sets of changes when both sides have new commits; git pull --no-rebase asks for the merge described here.

If any files have conflicts (changed both remotely and locally), git will warn you and ask you to deal with the conflict manually. This just means you need to open the file and find the git conflict markers (rows of less-than signs, equals signs, and greater-than signs). These mark the points in the file where changes have occurred locally as well as in the remote. You should fix the conflict, remove the markers, then save and close the file. To tell git you've fixed it, add the fix to the merge commit:

$ git add <file>

where <file> is the file you fixed. Do this for all files that have conflicts (do git status to check what needs fixing), then commit the merge:

$ git commit

That's the basics for working with git!
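The whole remote round trip (bare remote, first push, clone elsewhere, push from there, pull back home) can be tried safely on one machine. The sketch below is only an illustration under a few assumptions: the three directories live under a scratch folder and stand in for "remote", "home machine" and "other machine", the identity is a placeholder, and the changes are arranged so that no conflict arises.

```shell
#!/usr/bin/env bash
# Sketch: push to a bare remote, clone it elsewhere, and pull the changes back.
set -euo pipefail

work="$(mktemp -d)"              # hypothetical scratch folder
remote="$work/remote.git"        # stands in for the shared remote location
home_repo="$work/home"           # stands in for your home machine
away_repo="$work/away"           # stands in for the other machine

# Helper (made up for this sketch): placeholder identity for one repository.
set_identity() {
  git -C "$1" config user.name "Example Author"
  git -C "$1" config user.email "author@example.com"
}

# 1. Create the bare repository that acts as the remote.
git init --quiet --bare "$remote"

# 2. Home machine: first commit, add the remote, first push.
git init --quiet "$home_repo"
set_identity "$home_repo"
cd "$home_repo"
printf 'line one\n' > report.txt
git add report.txt
git commit --quiet -m "Start report"
branch="$(git symbolic-ref --short HEAD)"   # master or main, depending on setup
git remote add origin "$remote"
git push --quiet --set-upstream origin "$branch"

# 3. Other machine: clone, change the file, commit, push.
git clone --quiet "$remote" "$away_repo"
set_identity "$away_repo"
cd "$away_repo"
printf 'line two, written away from home\n' >> report.txt
git commit --quiet -am "Add second line"
git push --quiet

# 4. Back home: pull the new commit (merging, not rebasing).
cd "$home_repo"
git pull --quiet --no-rebase

line_count="$(wc -l < report.txt | tr -d ' ')"
commit_count="$(git rev-list --count HEAD)"
echo "report.txt has $line_count lines after $commit_count commits"
```

With a hosted service the only difference is that the path in git remote add and git clone becomes the repository URL, and step 1 happens on the website instead of with git init --bare.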